Abstract
Introduction: Sickle cell disease (SCD) is a genetic blood disorder that affects millions worldwide and requires early, accurate diagnosis to prevent complications. Traditional diagnostic methods, such as manual microscopy and hemoglobin electrophoresis, are time-consuming and labor-intensive. Artificial intelligence (AI) and machine learning (ML) offer promising solutions for automating SCD detection from blood smears, improving efficiency and accessibility. However, the performance and reliability of these AI/ML models vary widely across studies. This meta-analysis evaluates the diagnostic accuracy of AI/ML in detecting SCD from blood smears, analyzing 11 studies to compare algorithms, datasets, and clinical applicability. The findings are intended to guide future integration of AI into SCD diagnostics and to address gaps in validation and standardization.
Methods: A systematic search of MEDLINE, EMBASE, and CENTRAL was conducted for studies on ML-based SCD diagnosis published up to August 2025. Meta-analysis was performed using Stata 18.0, and risk of bias was assessed with QUADAS-2.
Results: Of 800 studies screened, 21 met the inclusion criteria for this meta-analysis of AI/ML models for SCD diagnosis, together comprising 25,428 images. The results showed high efficacy, with accuracies consistently exceeding 98%. Notably, a deep convolutional neural network (CNN) achieved an accuracy above 99% (95% CI, 85.00 to 99.99), and a study with a small dataset of 30 images reached 99.4% accuracy (95% CI, 82.68 to 99.89). Sample sizes ranged from 30 to 10,000 images; a smartphone-based clinical system using 3,199 cells from 57 patients reported 98.9% accuracy (95% CI, 62.56 to 99.9). Overall, the sensitivity and specificity were 96.4% (95% CI, 72.85 to 98.44) and 98.3% (95% CI, 89.44 to 97.23), respectively. In direct comparisons, deep learning models (CNN/ResNet) outperformed traditional machine learning models (SVM/Random Forest), with an accuracy of 92.68% (95% CI, 62 to 94.22). However, the data were insufficient to determine whether models perform better on public datasets than on clinical data, or whether AI/ML models can differentiate SCD subtypes. Additionally, concerns about overfitting in studies with small sample sizes highlight the need for larger, standardized research with validated datasets to fully assess model performance and biases. Quality assessment with QUADAS-2 indicated a moderate risk of bias across the included studies.
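To illustrate why small image sets yield wide confidence intervals, the following is a minimal sketch of how sensitivity, specificity, and a 95% confidence interval for a proportion can be computed from confusion-matrix counts. It uses the Wilson score interval as an assumption; it is not the pooled model fitted in Stata for this meta-analysis, and all counts and function names below are hypothetical, chosen only for illustration.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, not taken from any included study.
small = wilson_interval(29, 30)       # 29 of 30 images classified correctly
large = wilson_interval(2900, 3000)   # same accuracy, 100x more images
sens, spec = sensitivity_specificity(tp=48, fn=2, tn=49, fp=1)

print(f"n=30:   95% CI {small[0]:.3f} to {small[1]:.3f}")
print(f"n=3000: 95% CI {large[0]:.3f} to {large[1]:.3f}")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With the same observed accuracy, the interval for the 30-image case is much wider than for the 3,000-image case, which is one reason results from very small datasets should be read with caution.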
Conclusion: AI/ML models demonstrate high diagnostic accuracy for SCD, with deep learning models outperforming traditional approaches. However, further research with larger, standardized datasets is needed to address validation gaps and potential biases.